A Document Repository for Social Media and Speech Conversations
Abstract
We present a successfully implemented document repository REST service for flexible SCRUD (search, create, read, update, delete) storage of social media and speech conversations, using a GATE/TIPSTER-like document object model and providing a query language for document features. This software is currently being used in the SENSEI research project and will be published as open-source software before the project ends. It is, to the best of our knowledge, the first freely available, general purpose data repository to support large-scale multimodal (i.e., speech or text) conversation analytics.
Similar resources
Unleashing Corporate Communications: Social Media And Conversations With Customers
This paper reports on a small-scale study that explores the impact of social media adoption and use on corporate communications and brand image. The research is carried out in two phases: an analysis of brand communications via social media, and a small-scale survey of social media managers. We draw upon Habermas’ theory of communicative action (the ideal speech situation and discourse ethics) i...
Classification-based spoken text selection for LVCSR language modeling
Large vocabulary continuous speech recognition (LVCSR) has naturally been demanded for transcribing daily conversations, while developing spoken text data to train LVCSR is costly and time-consuming. In this paper, we propose a classification-based method to automatically select social media data for constructing a spoken-style language model in LVCSR. Three classification techniques, SVM, CRF,...
Ranking of Document Recommendations from Conversations using Probabilistic Latent Semantic Analysis
Information retrieval from documents is done through text search. Nowadays, efficient search is done through mining techniques. Speech is recognized in order to search for a document. A group of conversations is recorded using an Automatic Speech Recognition (ASR) technique. The system converts speech to text using the FISHER tool. Those conversations are stored in a database. Formulation of Implicit Quer...
How do Twitter Conversations Differ based on Geography, Time, and Subject? A Framework and Analysis of Topical Conversations in Microblogging
Automatic discovery of how members of social media are discussing different thoughts on particular topics would provide a unique insight into how people perceive different topics. However, identifying trending terms / words within a topical conversation is a difficult task. We take an information retrieval approach and use tf-idf (term frequency-inverse document frequency) to identify words tha...
Now Trending in Your Community: Social Media Insights For Your Public Health Mission
Introduction: In today’s fast-paced world, information is available (and expected) instantaneously. Social media has only fueled this expectation as it has permeated all aspects of our lives. More and more of the population is turning to social media outlets to share their thoughts and update their status, especially during disasters. With all these conversations occurring, it is only reasonable...